Collaborating Authors

 Monterrey


Advancing Multi-Step Mathematical Reasoning in Large Language Models through Multi-Layered Self-Reflection with Auto-Prompting

Loureiro, André de Souza, Valverde-Rebaza, Jorge, Noguez, Julieta, Escarcega, David, Marcacini, Ricardo

arXiv.org Artificial Intelligence

Recent advancements in Large Language Models (LLMs) have significantly improved their problem-solving capabilities. However, these models still struggle when faced with complex multi-step reasoning tasks. In this paper, we propose the Multi-Layered Self-Reflection with Auto-Prompting (MAPS) framework, a novel approach designed to enhance multi-step mathematical reasoning in LLMs by integrating techniques such as Chain of Thought (CoT), Self-Reflection, and Auto-Prompting. Unlike traditional static prompting methods, MAPS employs an iterative refinement process. Initially, the model generates a solution using CoT prompting. When errors are detected, an adaptive self-reflection mechanism identifies and analyzes them, generating tailored prompts to guide corrections. These dynamically adjusted prompts enable the model to iteratively refine its reasoning. Experiments on four well-established benchmarks across multiple LLMs show that MAPS significantly outperforms standard CoT and achieves competitive results with reasoning-optimized models. In addition, MAPS enables general-purpose LLMs to reach performance levels comparable to specialized reasoning models. While deeper reflection layers improve accuracy, they also increase token usage and costs. To balance this trade-off, MAPS strategically limits reflection depth, ensuring an optimal balance between cost and reasoning performance.
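The refinement loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `llm` callable, the `is_correct` error detector, the prompt wording, and the default `max_depth` are all hypothetical placeholders chosen for the example.

```python
from typing import Callable, Tuple


def maps_style_solve(
    problem: str,
    llm: Callable[[str], str],          # hypothetical: maps a prompt to a completion
    is_correct: Callable[[str], bool],  # hypothetical: flags whether an answer is acceptable
    max_depth: int = 3,                 # assumed cap on the number of reflection layers
) -> Tuple[str, int]:
    """Return the final answer and the number of reflection layers used."""
    # Step 1: initial attempt with a Chain-of-Thought style prompt.
    answer = llm(f"Solve the problem step by step.\n\nProblem: {problem}")
    depth = 0

    # Step 2: reflect and retry only while an error is detected and depth allows.
    while depth < max_depth and not is_correct(answer):
        # Self-reflection: ask the model to identify and analyze its errors.
        reflection = llm(
            "Identify and analyze the errors in this solution.\n\n"
            f"Problem: {problem}\nSolution: {answer}"
        )
        # Auto-prompting: turn the error analysis into a tailored prompt.
        tailored_prompt = llm(
            "Write a prompt that guides a corrected solution.\n\n"
            f"Problem: {problem}\nError analysis: {reflection}"
        )
        # Retry with the dynamically generated prompt.
        answer = llm(f"{tailored_prompt}\n\nProblem: {problem}")
        depth += 1

    return answer, depth
```

Each reflection layer in this sketch costs three additional model calls, which illustrates why capping the depth bounds token usage.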


MammoRGB: Dual-View Mammogram Synthesis Using Denoising Diffusion Probabilistic Models

Garza-Abdala, Jorge Alberto, Fumagal-González, Gerardo A., Avendano, Daly, Cardona, Servando, Hussain, Sadam, de Avila-Armenta, Eduardo, Toscano-Martínez, Jasiel H., Gurmendi, Diana S. M. Rosales, Pedro-Pérez, Alma A., Tamez-Pena, Jose Gerardo

arXiv.org Artificial Intelligence

Purpose: This study aims to develop and evaluate a three-channel denoising diffusion probabilistic model (DDPM) for synthesizing single-breast dual-view mammograms and to assess the impact of channel representations on image fidelity and cross-view consistency. Materials and Methods: A pretrained three-channel DDPM, sourced from Hugging Face, was fine-tuned on a private dataset of 11020 screening mammograms to generate paired craniocaudal (CC) and mediolateral oblique (MLO) views. Three third-channel encodings of the CC and MLO views were evaluated: sum, absolute difference, and zero channel. Each model produced 500 synthetic image pairs. Quantitative assessment involved breast mask segmentation using Intersection over Union (IoU) and Dice Similarity Coefficient (DSC), with distributional comparisons against 2500 real pairs using Earth Mover's Distance (EMD) and Kolmogorov-Smirnov (KS) tests. Qualitative evaluation included a visual Turing test by a non-expert radiologist to assess cross-view consistency and artifacts. Results: Synthetic mammograms showed IoU and DSC distributions comparable to real images, with EMD and KS values of 0.020 and 0.077, respectively. Models using sum or absolute difference encodings outperformed others in IoU and DSC (p < 0.001), though distributions remained broadly similar. Generated CC and MLO views maintained cross-view consistency, with 6 to 8 percent of synthetic images exhibiting artifacts consistent with those in the training data. Conclusion: Three-channel DDPMs can generate realistic and anatomically consistent dual-view mammograms with promising applications in dataset augmentation.
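As a rough illustration of the three third-channel encodings and the two overlap metrics named in this abstract, consider the sketch below. It assumes images are flattened lists of pixel intensities and masks are flattened lists of 0/1 values. The abstract does not say whether the sum channel is rescaled or clipped, so no normalization is applied here, and the function names are hypothetical.

```python
def third_channel(cc, mlo, mode):
    """Build a third channel from flattened CC and MLO pixel lists.

    Assumption: plain element-wise operations with no rescaling or clipping.
    """
    if mode == "sum":
        return [a + b for a, b in zip(cc, mlo)]
    if mode == "absdiff":
        return [abs(a - b) for a, b in zip(cc, mlo)]
    if mode == "zero":
        return [0.0] * len(cc)
    raise ValueError(f"unknown mode: {mode}")


def iou_and_dice(mask_a, mask_b):
    """Return (IoU, DSC) for two flattened binary masks of equal length."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    union = size_a + size_b - intersection
    if union == 0:
        # Both masks are empty; treat them as a perfect match by convention.
        return 1.0, 1.0
    return intersection / union, 2 * intersection / (size_a + size_b)


# Toy two-pixel "views" and four-pixel masks, for illustration only.
cc, mlo = [0.5, 0.25], [0.25, 0.75]
print(third_channel(cc, mlo, "sum"))      # [0.75, 1.0]
print(third_channel(cc, mlo, "absdiff"))  # [0.25, 0.5]
print(iou_and_dice([1, 1, 0, 0], [1, 0, 1, 0]))
```

For the two toy masks, one pixel overlaps out of three covered pixels, so IoU is 1/3 and DSC is 2·1/(2+2) = 0.5.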


When and How to Express Empathy in Human-Robot Interaction Scenarios

Cruz, Christian Arzate, Montiel-Vazquez, Edwin C., Maeda, Chikara, Gomez, Randy

arXiv.org Artificial Intelligence

Incorporating empathetic behavior into robots can improve their social effectiveness and interaction quality. In this paper, we present whEE (when and how to express empathy), a framework that enables social robots to detect when empathy is needed and generate appropriate responses. Using large language models, whEE identifies key behavioral empathy cues in human interactions. We evaluate it in human-robot interaction scenarios with our social robot, Haru. Results show that whEE effectively identifies and responds to empathy cues, providing valuable insights for designing social robots capable of adaptively modulating their empathy levels across various interaction contexts. In most scenarios, Large Language Models (LLMs) represent the state-of-the-art approach for classifying empathy [1], [2] and generating empathetic responses [3], [4]. However, the development of robots capable of dynamically adjusting their level of empathy based on the context remains an underexplored area [5]. To this end, we introduce whEE (when and how to express empathy), an empathy framework that provides guidelines on when robots should respond empathetically and how to achieve it. Using our framework, we analyze the utterances of speakers and listeners in dyadic and group conversations with varying levels of empathy. Our analysis identifies key empathy cues that indicate when a speaker seeks an empathetic response and the cues exhibited by listeners displaying high levels of empathy. We approach empathy by focusing on observable behaviors that individuals exhibit when demonstrating an understanding of others' emotions and engaging deeply with their experiences, referred to as behavioral empathy [6].


Detecting Hope Across Languages: Multiclass Classification for Positive Online Discourse

Abiola, T. O., Abiodun, K. D., Olumide, O. E., Adebanji, O. O., Calvo, O. Hiram, Sidorov, Grigori

arXiv.org Artificial Intelligence

The detection of hopeful speech in social media has emerged as a critical task for promoting positive discourse and well-being. In this paper, we present a machine learning approach to multiclass hope speech detection across multiple languages, including English, Urdu, and Spanish. We leverage transformer-based models, specifically XLM-RoBERTa, to detect and categorize hope speech into three distinct classes: Generalized Hope, Realistic Hope, and Unrealistic Hope. Our proposed methodology is evaluated on the PolyHope dataset for the PolyHope-M 2025 shared task, achieving competitive performance across all languages. We compare our results with existing models, demonstrating that our approach significantly outperforms prior state-of-the-art techniques in terms of macro F1 scores. We also discuss the challenges in detecting hope speech in low-resource languages and the potential for improving generalization. This work contributes to the development of multilingual, fine-grained hope speech detection models, which can be applied to enhance positive content moderation and foster supportive online communities.


Multilingual Hope Speech Detection: A Comparative Study of Logistic Regression, mBERT, and XLM-RoBERTa with Active Learning

Abiola, T. O., Abiodun, K. D., Olumide, O. E., Adebanji, O. O., Calvo, O. Hiram, Sidorov, Grigori

arXiv.org Artificial Intelligence

Hope speech, language that fosters encouragement and optimism, plays a vital role in promoting positive discourse online. However, its detection remains challenging, especially in multilingual and low-resource settings. This paper presents a multilingual framework for hope speech detection using an active learning approach and transformer-based models, including mBERT and XLM-RoBERTa. Experiments were conducted on datasets in English, Spanish, German, and Urdu, including benchmark test sets from recent shared tasks. Our results show that transformer models significantly outperform traditional baselines, with XLM-RoBERTa achieving the highest overall accuracy. Furthermore, our active learning strategy maintained strong performance even with small annotated datasets. This study highlights the effectiveness of combining multilingual transformers with data-efficient training strategies for hope speech detection.


To Explain Or Not To Explain: An Empirical Investigation Of AI-Based Recommendations On Social Media Platforms

Haque, AKM Bahalul, Islam, A. K. M. Najmul, Mikalef, Patrick

arXiv.org Artificial Intelligence

AI-based social media recommendations have great potential to improve the user experience. However, these recommendations often do not match the user's interests and create an unpleasant experience. Moreover, because the recommendation system is a black box, it raises comprehensibility and transparency issues. This paper investigates social media recommendations from an end-user perspective. For the investigation, we used the popular social media platform Facebook and recruited regular users to conduct a qualitative analysis. We asked participants about the social media content suggestions, their comprehensibility, and explainability. Our analysis shows that users mostly require explanations when they encounter unfamiliar content and when they want to ensure their online data security. Furthermore, users require concise, non-technical explanations along with the facility of controlled information flow. In addition, we observed that explanations impact the users' perception of transparency, trust, and understandability. Finally, we have outlined some design implications and presented a synthesized framework based on our data analysis.


Nearest-Better Network for Visualizing and Analyzing Combinatorial Optimization Problems: A Unified Tool

Diao, Yiya, Li, Changhe, Zeng, Sanyou, Cai, Xinye, Luo, Wenjian, Yang, Shengxiang, Coello, Carlos A. Coello

arXiv.org Artificial Intelligence

The Nearest-Better Network (NBN) is a powerful method to visualize sampled data for continuous optimization problems while preserving multiple landscape features. However, the calculation of NBN is very time-consuming, and the extension of the method to combinatorial optimization problems is challenging but very important for analyzing the algorithm's behavior. This paper provides a straightforward theoretical derivation showing that the NBN essentially functions as the maximum probability transition network for algorithms. This paper also presents an efficient NBN computation method with logarithmic-linear time complexity to address the time-consuming issue. By applying this efficient NBN algorithm to the OneMax problem and the Traveling Salesman Problem (TSP), we have made several remarkable discoveries for the first time: The fitness landscape of OneMax exhibits neutrality, ruggedness, and modality features. The primary challenges of TSP problems are ruggedness, modality, and deception. Two state-of-the-art TSP algorithms (i.e., EAX and LKH) have limitations when addressing challenges related to modality and deception, respectively. LKH, based on local search operators, fails when there are deceptive solutions near global optima. EAX, which is based on a single population, can efficiently maintain diversity. However, when multiple attraction basins exist, EAX retains individuals within multiple basins simultaneously, reducing inter-basin interaction efficiency and leading to stagnation of the algorithm.


Automated planning with ontologies under coherence update semantics (Extended Version)

Borgwardt, Stefan, Nhu, Duy, Röger, Gabriele

arXiv.org Artificial Intelligence

Standard automated planning employs first-order formulas under closed-world semantics to achieve a goal with a given set of actions from an initial state. We follow a line of research that aims to incorporate background knowledge into automated planning problems, for example, by means of ontologies, which are usually interpreted under open-world semantics. We present a new approach for planning with DL-Lite ontologies that combines the advantages of ontology-based action conditions provided by explicit-input knowledge and action bases (eKABs) and ontology-aware action effects under the coherence update semantics. We show that the complexity of the resulting formalism is not higher than that of previous approaches and provide an implementation via a polynomial compilation into classical planning. An evaluation of existing and new benchmarks examines the performance of a planning system on different variants of our compilation.


An efficient plant disease detection using transfer learning approach

Sambana, Bosubabu, Nnadi, Hillary Sunday, Wajid, Mohd Anas, Fidelia, Nwosu Ogochukwu, Camacho-Zuñiga, Claudia, Ajuzie, Henry Dozie, Onyema, Edeh Michael

arXiv.org Artificial Intelligence

Plant diseases pose significant challenges to farmers and the agricultural sector at large. Early detection of plant diseases is crucial to mitigating their effects and preventing widespread damage, as outbreaks can severely impact the productivity and quality of crops. With advancements in technology, there are increasing opportunities for automating the monitoring and detection of disease outbreaks in plants. This study proposes a system designed to identify and monitor plant diseases using a transfer learning approach. Specifically, the study utilizes YOLOv7 and YOLOv8, two state-of-the-art models in the field of object detection. By fine-tuning these models on a dataset of plant leaf images, the system is able to accurately detect the presence of bacterial, fungal, and viral diseases such as Powdery Mildew, Angular Leaf Spot, Early blight, and Tomato mosaic virus. The model's performance was evaluated using several metrics, including mean Average Precision (mAP), F1-score, Precision, and Recall, yielding values of 91.05, 89.40, 91.22, and 87.66, respectively. The result demonstrates the superior effectiveness and efficiency of YOLOv8 compared to other object detection methods, highlighting its potential for use in modern agricultural practices. The approach provides a scalable, automated solution for early detection of plant diseases, contributing to enhanced crop yield, reduced reliance on manual monitoring, and sustainable agricultural practices.
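The reported F1-score can be cross-checked against the reported Precision and Recall, since F1 is their harmonic mean. The short sketch below only recomputes that relationship from the figures in the abstract, assuming all three refer to the same model and evaluation run. The function name is illustrative.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and Recall as reported in the abstract, in percent.
f1 = f1_from_precision_recall(91.22, 87.66)
print(f"{f1:.2f}")  # 89.40
```

The recomputed value agrees with the reported F1-score of 89.40 to two decimal places.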


Comparison of ConvNeXt and Vision-Language Models for Breast Density Assessment in Screening Mammography

Molina-Román, Yusdivia, Gómez-Ortiz, David, Menasalvas-Ruiz, Ernestina, Tamez-Peña, José Gerardo, Santos-Díaz, Alejandro

arXiv.org Artificial Intelligence

Mammographic breast density classification is essential for cancer risk assessment but remains challenging due to subjective interpretation and inter-observer variability. This study compares multimodal and CNN-based methods for automated classification using the BI-RADS system, evaluating BioMedCLIP and ConvNeXt across three learning scenarios: zero-shot classification, linear probing with textual descriptions, and fine-tuning with numerical labels. Results show that zero-shot classification achieved modest performance, while the fine-tuned ConvNeXt model outperformed the BioMedCLIP linear probe. Although linear probing demonstrated potential with pretrained embeddings, it was less effective than full fine-tuning. These findings suggest that despite the promise of multimodal learning, CNN-based models with end-to-end fine-tuning provide stronger performance for specialized medical imaging. The study underscores the need for more detailed textual representations and domain-specific adaptations in future radiology applications. Accurate breast density classification plays a critical role in assessing breast cancer risk.